Loop Transformations for Performance and Message Latency Hiding in Parallel Object-Oriented Frameworks (Extended Abstract)
Authors
Abstract
Application codes reliably achieve performance far below the advertised capabilities of existing architectures, and this problem is worsening with increasingly parallel machines. For large-scale numerical applications, stencil operations often account for the greater part of the computational cost, and the primary sources of inefficiency are the costs of message passing and poor cache utilization. This paper proposes and demonstrates optimizations for stencil and stencil-like computations, in both serial and parallel environments, that ameliorate these sources of inefficiency. Additionally, we argue that when stencil-like computations are encoded at a high level using object-oriented parallel array class libraries, these optimizations, which are beyond the capability of compilers, may be automated.
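To make the terms concrete, the following is a minimal sketch of one sweep of a 5-point averaging stencil, written first as a plain loop nest and then with the loops tiled (blocked), a standard loop transformation aimed at cache reuse. It is a generic illustration, not the transformation proposed in the paper; the function names, the averaging stencil, and the `tile` parameter are all assumptions chosen for the example. Pure Python shows only the loop structure and will not exhibit the cache effect itself.

```python
def stencil_naive(grid):
    """One sweep of a 5-point averaging stencil over the interior points."""
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # boundary values are carried over unchanged
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            out[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                + grid[i][j - 1] + grid[i][j + 1])
    return out


def stencil_tiled(grid, tile=4):
    """Same sweep, with both loops split into tile-sized blocks.

    Each block touches a small patch of the grid, so in a compiled
    language the patch is more likely to stay in cache while it is reused.
    `tile` is a hypothetical tuning parameter.
    """
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for ii in range(1, n - 1, tile):          # loop over tile rows
        for jj in range(1, m - 1, tile):      # loop over tile columns
            for i in range(ii, min(ii + tile, n - 1)):
                for j in range(jj, min(jj + tile, m - 1)):
                    out[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                        + grid[i][j - 1] + grid[i][j + 1])
    return out


def make_grid(n, m):
    """Hypothetical test input: value at (i, j) is i * m + j."""
    return [[float(i * m + j) for j in range(m)] for i in range(n)]


if __name__ == "__main__":
    g = make_grid(7, 9)
    # Tiling only reorders independent point updates, so results match.
    print(stencil_tiled(g, tile=3) == stencil_naive(g))
```

Because each output point depends only on the input grid, the tiled version visits the same points in a different order and produces the same result; only the memory access pattern changes.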
Similar resources
Linear and Extended Linear Transformations for Shared-Memory Multiprocessors
Advances in program transformation frameworks have significantly advanced compiler technology over the past few years. Program transformation frameworks provide mathematical abstractions of loop and data structures and formal methods for manipulating these structures. It is these frameworks that have allowed the development of algorithms capable of automatically tailoring an application for a ta...
A New Communication and Computation Overlapping Model with Loop Sub-Partitioning and Dynamic Scheduling
Latency hiding techniques can significantly improve the performance of parallel programs on distributed-memory systems. This paper presents a communication and computation overlapping model to hide communication latency in data-parallel programs. The model makes use of a loop sub-partitioning scheme in which a given loop partition is partit...
A Technique for Documenting the Framework of an Object-Oriented System
This paper presents techniques for documenting the design of frameworks for object-oriented systems and applies the approach to the design of a configurable message passing system. The technique decomposes a framework into six concerns: the class hierarchy, protocols, control flow, synchronization, entity relationships, and configurations of the system. An abstract description of each concern is...
Optimizing Transformations of Stencil Operations for Parallel Object-Oriented Scientific Frameworks on Cache-Based Architectures
High-performance scientific computing relies increasingly on high-level large-scale object-oriented software frameworks to manage both algorithmic complexity and the complexities of parallelism: distributed data management, process management, inter-process communication, and load balancing. This encapsulation of data management, together with the prescribed semantics of a typical fundamental c...
Latency Hiding in Parallel Systems: A Quantitative Approach
In many parallel applications, network latency causes a dramatic loss in processor utilization. This paper examines software pipelining as a technique for network latency hiding. It quantifies the potential improvements with detailed, instruction-level simulations. The benchmarks used are the Livermore Loop kernels and BLAS Level 1. These were parallelized and run on the instruction-level RISC s...